Analysis of Obstacles in American Ninja Warrior

By Shayna Hay and Robert Wilson

For the past 13 seasons, the American sports entertainment reality show American Ninja Warrior has been taking the world by storm. The show brings on a variety of contestants and challenges them to complete obstacle courses across various rounds and stages. If an individual passes a round, they move on to the next one. Well, we are planning on going on American Ninja Warrior and want a leg up on the competition. Is there a way to predict the obstacles we will see on the courses? Are there certain rounds where different types of obstacles are more likely? Well, let's find out.

Throughout this tutorial, we are going to look for trends among the obstacles on American Ninja Warrior, in the hope of finding the best way to predict which obstacles we might see, and how often, in future seasons.

Tools

For this tutorial, you will need the following libraries:

1. pandas
2. matplotlib.pyplot
3. seaborn
4. sklearn
5. sklearn.linear_model
6. statsmodels.formula.api

Motivation

Here we are importing all of the necessary libraries and loading the dataset. Go to https://data.world/ninja/anw-obstacle-history and download the dataset from there. We will then be able to read the table into a dataframe using the code below.

In [25]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn.linear_model import LinearRegression
from statsmodels.formula.api import ols
In [26]:
data = pd.read_excel('American Ninja Warrior Obstacle History.xlsx')
data.head()
Out[26]:
Season Location Round/Stage Obstacle Name Obstacle Order
0 1 Venice Qualifying Quintuple Steps 1
1 1 Venice Qualifying Rope Swing 2
2 1 Venice Qualifying Rolling Barrel 3
3 1 Venice Qualifying Jumping Spider 4
4 1 Venice Qualifying Pipe Slider 5

Above are the first few rows of the dataset, which we downloaded from data.world. It contains every obstacle that appeared in the first 10 seasons of American Ninja Warrior. Each row gives the name of the obstacle, the season it appeared in, the location of the round, the round/stage, and the obstacle's position in the course. To begin, we first want to see whether there is even anything worth looking at: if every obstacle is only ever seen once, there is no point trying to plan for the obstacles we might face.

In [27]:
#creating new dataframe to add obstacles to 
obstacles = pd.DataFrame(columns=['obstacles','count'])
#grouping by obstacle
groups = data.groupby(['Obstacle Name'])

#looping through and adding the length of each obstacle chart and obstacle to the new dataframe
for name, group in groups:
    if len(group)>10:
        obstacles.loc[len(obstacles.index)] = [name, len(group)] 
    else:
        obstacles.loc[len(obstacles.index)] = ["", len(group)] 

plt.rcParams["figure.figsize"] = (200, 200)  # set the figure size before drawing
plt.pie(obstacles['count'], labels=obstacles['obstacles'], textprops={'fontsize': 100})
plt.show()

In order to see the most-occurring obstacles, we decided to create a pie chart, with one slice per obstacle sized by the number of times it has been seen. We grouped the dataset by obstacle name and then added the name and size of each group to a new dataframe containing the count of each obstacle. While this was only to see whether there is any motivation for analyzing further, we were still interested in which obstacles have the most occurrences, so we labeled every obstacle that occurred more than 10 times. Looking at the pie chart, we can see that not all "slices" are the same: some obstacles appear more often than others, and after further analysis these might be obstacles of importance for us. This suggests we might be able to predict certain obstacles.
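The same per-obstacle totals can be built more compactly with `value_counts`, avoiding the row-by-row loop. A minimal sketch on a hypothetical mini-dataset (the names and counts here are made up for illustration):

```python
import pandas as pd

# hypothetical stand-in for the real obstacle history
data = pd.DataFrame({'Obstacle Name': ['Warped Wall'] * 12 + ['Rope Swing'] * 2})

# one line replaces the groupby loop: counts per obstacle, sorted descending
counts = data['Obstacle Name'].value_counts()

# blank out the labels of obstacles seen 10 times or fewer, as in the pie chart
labels = [name if n > 10 else "" for name, n in counts.items()]
```

`counts` and `labels` could then be passed straight to `plt.pie(counts, labels=labels)`.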

Now that we can see certain obstacles occur more often overall, what about season to season? Do the creators of American Ninja Warrior reuse obstacles in the same season? If so, there may be certain obstacles we should practice more because they might appear in multiple rounds. You can read more about how to create pie charts here: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.pie.html.

In [28]:
#grouping by season
groups = data.groupby(['Season'])

#creating multiple plots
fig, ax = plt.subplots(10,figsize=(30, 30))
count = 0
# going through each season
for name, group in groups:
    frame = pd.DataFrame(columns=['Obstacle Name','count'])
    obstacles = group.groupby(['Obstacle Name'])
    
    
    #filling dataframe with obstacle information
    for name1, group1 in obstacles:
        if len(group1)>5:
            frame.loc[len(frame.index)] = [name1,len(group1)]
        else:
            frame.loc[len(frame.index)] = ["",len(group1)]
            
    ax[count].pie(frame['count'], labels=frame['Obstacle Name'])
    ax[count].set_title(name)
    count= count +1

In order to see whether certain obstacles occur multiple times in a season, we created one pie chart per season showing the number of times each obstacle occurs in it. We grouped the original dataset by season, and then each season by obstacle, to get the count of each obstacle, created a pie chart from those counts, and labeled the slices with a count greater than 5. Looking at the pie charts, we can see that the seasons started consistently repeating obstacles after the first three. The Warped Wall, Salmon Ladder, Quintuple Steps, and Floating Steps seem to occur multiple times across multiple seasons. This could indicate that we might see these in future seasons and should really focus on perfecting them.

Now that we can see which obstacles occur more often overall and per season, what about each location? Are there specific locations that utilize certain obstacles more than others? If so, we can try to plan which obstacles we might see given the location of the season we will be on.

In [29]:
groups = data.groupby(['Location'])

fig, ax = plt.subplots(30,figsize=(40, 40))
count = 0
# going through each location
for name, group in groups:
    frame = pd.DataFrame(columns=['Obstacle Name','count'])
    obstacles = group.groupby(['Obstacle Name'])
    
    
    #filling dataframe with obstacle information
    for name1, group1 in obstacles:
        if len(group1)>=3:
            frame.loc[len(frame.index)] = [name1,len(group1)]
        else:
            frame.loc[len(frame.index)] = ["",len(group1)]
            
    ax[count].pie(frame['count'], labels=frame['Obstacle Name'])
    ax[count].set_title(name)
    count= count +1

In order to see whether certain obstacles occur more in certain locations, we created one pie chart per location showing the number of times each obstacle occurs there. We grouped the original dataset by location, and then each location by obstacle, to get the count of each obstacle, created a pie chart from those counts, and labeled the slices with a count of 3 or higher. Looking at the pie charts, we can see that the majority of locations do not have repeat obstacles, other than Las Vegas, Los Angeles, Miami, and Venice, with Las Vegas clearly having the most. This could be because the creators space obstacles out across locations, but it could also be because Las Vegas is where the national finals are filmed every season.

Now that we can see the obstacles as a whole across seasons and locations, we can narrow in and look at the obstacles for each round. We know that if we do not make it past the Qualifiers and Semi-Finals, there is no way we can make it to the National Finals. If we can identify the most common obstacles per round, we will know what to focus on in the hope of passing the early rounds and having a chance to make it further.

In [30]:
groups = data.groupby(['Round/Stage'])

fig, ax = plt.subplots(8,figsize=(40, 40))
count = 0
# going through each round/stage
for name, group in groups:
    frame = pd.DataFrame(columns=['Obstacle Name','count'])
    obstacles = group.groupby(['Obstacle Name'])
    
    
    #filling dataframe with obstacle information
    for name1, group1 in obstacles:
        if len(group1)>=5:
            frame.loc[len(frame.index)] = [name1,len(group1)]
        else:
            frame.loc[len(frame.index)] = ["",len(group1)]
            
    ax[count].pie(frame['count'], labels=frame['Obstacle Name'])
    ax[count].set_title(name)
    count= count +1

In order to see whether any obstacles are more common in certain rounds, we created one pie chart per round showing the count of each obstacle in it. We grouped the dataset by round, and then each round by obstacle, to get the count of each obstacle, created a pie chart from those counts, and labeled the slices with a count of 5 or higher. Looking at the pie charts, we can see that the finals seem to repeat obstacles a lot more than the semi-finals and qualifying. This could mean that it is harder to predict the earlier rounds but easier to predict the later ones.

Motivation Conclusion: From all of the pie charts above, whether obstacles per season, per location, or per round, we can clearly see that some obstacles appear more than others. Across the different groupings, it seems we might best predict a future season of American Ninja Warrior from the prior seasons, rather than from rounds or locations specifically. That being said, we are going to look at the count of each obstacle per season and see whether those counts are increasing or decreasing. We will then attempt to use this to predict the next season of American Ninja Warrior.

Exploration

Based on all of the potential ways an obstacle can be chosen that we saw above, we have decided that the best way to predict the obstacles of future seasons is from past seasons. Our null hypothesis is therefore that we are not able to predict the count of each major obstacle in future seasons; our alternative hypothesis is that we are.

If we want to use the count of each obstacle per season to predict future seasons, we should only use obstacles that are relevant. We decided it would be best to remove any obstacle that has been seen fewer than 10 times in total. This removes obstacles the show may have only tested a few times, or that we simply do not have enough information on.
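An equivalent, more concise way to drop the rare obstacles is `groupby(...).filter(...)`, which keeps only the groups satisfying a condition. A sketch on toy data (the names 'A' and 'B' are hypothetical):

```python
import pandas as pd

# toy frame: 'A' appears 10 times, 'B' only twice
data = pd.DataFrame({'Obstacle Name': ['A'] * 10 + ['B'] * 2,
                     'Season': list(range(1, 11)) + [1, 2]})

# keep only obstacles seen at least 10 times overall,
# matching the build-a-delete-list-then-drop approach used below
data = data.groupby('Obstacle Name').filter(lambda g: len(g) >= 10)
```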

In [31]:
groups = data.groupby(['Obstacle Name'])
delete = []

# finding which obstacles to delete
for name, group in groups:
    if len(group) < 10:
        delete.append(name)
        
#deleting obstacles
data.drop(data[data['Obstacle Name'].isin(delete)].index, inplace = True)

After removing the obstacles deemed irrelevant, we can now get the count of each obstacle per season. We will do this by grouping by season and obstacle, then adding the season, obstacle, and count to a new dataframe. In addition, we will go through each season and add any obstacle not present that season with a count of 0. This will allow us to analyze each obstacle individually.
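`pd.crosstab` can produce the same season-by-obstacle counts in one call, filling absent pairs with 0 automatically; stacking it back recovers the long Season / ObstacleName / count layout used below. A sketch with hypothetical obstacle names:

```python
import pandas as pd

# toy data: obstacle 'X' never appears in season 2
data = pd.DataFrame({'Season': [1, 1, 1, 2],
                     'Obstacle Name': ['X', 'X', 'Y', 'Y']})

# rows are seasons, columns are obstacles, missing pairs become 0
table = pd.crosstab(data['Season'], data['Obstacle Name'])

# back to one row per (season, obstacle) pair with a count column
frame = (table.stack().reset_index(name='count')
              .rename(columns={'Obstacle Name': 'ObstacleName'}))
```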

In [32]:
groups = data.groupby(['Season'])
names = data.groupby(['Obstacle Name'])

frame = pd.DataFrame(columns=['Season','ObstacleName','count'])
obs = []

#getting all unique obstacles
for name, group in names:
    obs.append(name)

# going through each season
for name, group in groups:
    obstacles = group.groupby(['Obstacle Name'])

    #filling dataframe with obstacle information
    for name1, group1 in obstacles:
        frame.loc[len(frame.index)] = [name,name1,len(group1)]
        
    # adding obstacles with a count of 0 if not in the season
    needed = set(obs)-set(group['Obstacle Name'])
    for value in needed:
        frame.loc[len(frame.index)] = [name,value,0]
pd.set_option('display.max_rows',None)
frame
Out[32]:
Season ObstacleName count
0 1 Jumping Spider 3
1 1 Log Grip 1
2 1 Quintuple Steps 2
3 1 Rope Ladder 1
4 1 Salmon Ladder 1
5 1 Wall Lift 1
6 1 Warped Wall 3
7 1 Invisible Ladder 0
8 1 Bridge of Blades 0
9 1 Quad Steps 0
10 1 Rolling Log 0
11 1 Jump Hang 0
12 1 Floating Steps 0
13 2 Bridge of Blades 2
14 2 Jumping Spider 3
15 2 Quad Steps 2
16 2 Rope Ladder 1
17 2 Salmon Ladder 1
18 2 Wall Lift 1
19 2 Warped Wall 3
20 2 Invisible Ladder 0
21 2 Log Grip 0
22 2 Rolling Log 0
23 2 Jump Hang 0
24 2 Floating Steps 0
25 2 Quintuple Steps 0
26 3 Bridge of Blades 2
27 3 Jump Hang 2
28 3 Jumping Spider 1
29 3 Log Grip 2
30 3 Quad Steps 2
31 3 Rope Ladder 1
32 3 Salmon Ladder 1
33 3 Wall Lift 1
34 3 Warped Wall 3
35 3 Floating Steps 0
36 3 Invisible Ladder 0
37 3 Rolling Log 0
38 3 Quintuple Steps 0
39 4 Bridge of Blades 4
40 4 Jump Hang 12
41 4 Jumping Spider 1
42 4 Log Grip 12
43 4 Quad Steps 12
44 4 Rolling Log 1
45 4 Rope Ladder 1
46 4 Salmon Ladder 6
47 4 Wall Lift 5
48 4 Warped Wall 13
49 4 Invisible Ladder 0
50 4 Floating Steps 0
51 4 Quintuple Steps 0
52 5 Jumping Spider 1
53 5 Quintuple Steps 8
54 5 Rolling Log 2
55 5 Rope Ladder 1
56 5 Salmon Ladder 4
57 5 Wall Lift 1
58 5 Warped Wall 9
59 5 Invisible Ladder 0
60 5 Log Grip 0
61 5 Bridge of Blades 0
62 5 Quad Steps 0
63 5 Jump Hang 0
64 5 Floating Steps 0
65 6 Bridge of Blades 2
66 6 Jump Hang 2
67 6 Jumping Spider 1
68 6 Log Grip 2
69 6 Quintuple Steps 10
70 6 Rolling Log 2
71 6 Rope Ladder 1
72 6 Salmon Ladder 5
73 6 Wall Lift 1
74 6 Warped Wall 11
75 6 Invisible Ladder 0
76 6 Floating Steps 0
77 6 Quad Steps 0
78 7 Invisible Ladder 6
79 7 Jump Hang 2
80 7 Jumping Spider 1
81 7 Log Grip 2
82 7 Quintuple Steps 12
83 7 Rolling Log 2
84 7 Rope Ladder 1
85 7 Salmon Ladder 6
86 7 Wall Lift 1
87 7 Warped Wall 13
88 7 Floating Steps 0
89 7 Quad Steps 0
90 7 Bridge of Blades 0
91 8 Floating Steps 10
92 8 Invisible Ladder 5
93 8 Jumping Spider 1
94 8 Log Grip 2
95 8 Rolling Log 2
96 8 Rope Ladder 1
97 8 Salmon Ladder 5
98 8 Warped Wall 11
99 8 Bridge of Blades 0
100 8 Quad Steps 0
101 8 Wall Lift 0
102 8 Jump Hang 0
103 8 Quintuple Steps 0
104 9 Floating Steps 12
105 9 Jumping Spider 1
106 9 Rolling Log 2
107 9 Rope Ladder 1
108 9 Salmon Ladder 6
109 9 Warped Wall 13
110 9 Invisible Ladder 0
111 9 Log Grip 0
112 9 Bridge of Blades 0
113 9 Quad Steps 0
114 9 Wall Lift 0
115 9 Jump Hang 0
116 9 Quintuple Steps 0
117 10 Floating Steps 6
118 10 Jumping Spider 1
119 10 Rope Ladder 1
120 10 Salmon Ladder 6
121 10 Warped Wall 7
122 10 Invisible Ladder 0
123 10 Log Grip 0
124 10 Bridge of Blades 0
125 10 Quad Steps 0
126 10 Wall Lift 0
127 10 Rolling Log 0
128 10 Jump Hang 0
129 10 Quintuple Steps 0

Now that we have this new dataframe with all of the information we are interested in, we can plot all of the points in a linear regression plot, with Season on the x-axis, count on the y-axis, and one regression line per obstacle. This lets us see how the count of each obstacle changes over time.

In [33]:
# lmplot creates its own figure, so plt.figure() is unnecessary here
sns.lmplot(data=frame, x='Season', y="count", hue='ObstacleName', height=8, aspect=2)
Out[33]:
<seaborn.axisgrid.FacetGrid at 0x1de1f74b5e0>

While this plot shows us a lot, many of the obstacles overlap, which makes it difficult to see the trends and pull out information. Even so, the Salmon Ladder, Floating Steps, and Warped Wall seem to show a strong correlation. Let's dive a little deeper and look at each obstacle individually to get a better idea. We start by grouping by ObstacleName and then using those groups with the linear regression plotting tools lmplot and regplot. You can read more about lmplot here: https://seaborn.pydata.org/generated/seaborn.lmplot.html and about regplot here: https://seaborn.pydata.org/generated/seaborn.regplot.html#seaborn.regplot.

In [34]:
plt.figure(figsize=(15,8))

# splitting the frame into one dataframe per obstacle (groups come back in
# alphabetical order, so group1 is Bridge of Blades, group2 Floating Steps, ...)
obstacles = frame.groupby(["ObstacleName"])
(group1, group2, group3, group4, group5, group6, group7,
 group8, group9, group10, group11, group12, group13) = [
    item for key, item in obstacles]

sns.regplot(data=group1, x='Season', y='count')
plt.title("Bridge of Blades")
Out[34]:
Text(0.5, 1.0, 'Bridge of Blades')

For the Bridge of Blades, the highest count was in Season 4, and it only went down from there. It has not been seen in any season past Season 6 and shows a negative correlation. This leads us to believe there would be a fairly low count of this obstacle in future seasons.

In [35]:
plt.figure(figsize=(15,8))
sns.regplot(data=group2, x='Season', y='count')
plt.title("Floating Steps")
Out[35]:
Text(0.5, 1.0, 'Floating Steps')

For the Floating Steps, the highest count was in Season 9, and the obstacle seems fairly new: it was not seen in the earlier seasons, giving it a positive correlation with season. This leads us to believe there would be an average or higher count of this obstacle in future seasons.

In [36]:
plt.figure(figsize=(15,8))
sns.regplot(data=group3, x='Season', y='count')
plt.title("Invisible Ladder")
Out[36]:
Text(0.5, 1.0, 'Invisible Ladder')

For the Invisible Ladder, the highest count was in Season 7, and it only appeared in two seasons, both with relatively high counts. This leads us to believe there is a small chance of seeing the Invisible Ladder, but if we do, it might have a higher count.

In [37]:
plt.figure(figsize=(15,8))
sns.regplot(data=group4, x='Season', y='count')
plt.title("Jump Hang")
Out[37]:
Text(0.5, 1.0, 'Jump Hang')

For the Jump Hang, the highest count was in Season 4, with a remarkably high count of 12, whereas every other season had a count of at most 2. Season 4 drives its negative correlation between season and count. This tells us it is a toss-up whether we will see the Jump Hang in a future season, and it would probably be at a lower count if so.

In [38]:
plt.figure(figsize=(15,8))
sns.regplot(data=group5, x='Season', y='count')
plt.title("Jumping Spider")
Out[38]:
Text(0.5, 1.0, 'Jumping Spider')

For the Jumping Spider, we can see that it started with a larger count than in later seasons: the first two seasons have a count of 3, whereas every season after has a count of only 1. This gives a negative correlation between count and season, but based on Seasons 3 through 10 we can expect one of these obstacles in each future season.

In [39]:
plt.figure(figsize=(15,8))
sns.regplot(data=group6, x='Season', y='count')
plt.title("Log Grip")
Out[39]:
Text(0.5, 1.0, 'Log Grip')

For the Log Grip, the highest count was in Season 4, which was remarkably high at 12. It has been seen in other seasons, but not in the past two. This gives the season-count trend a very slightly negative correlation, which does not give us much indication of what we might see in future seasons.

In [40]:
plt.figure(figsize=(15,8))
sns.regplot(data=group7, x='Season', y='count')
plt.title("Quad Steps")
Out[40]:
Text(0.5, 1.0, 'Quad Steps')

For the Quad Steps, the highest count was also in Season 4, again remarkably high at 12, after which the obstacle was not seen at all. This gives the trend a slightly negative correlation. Based on this we would not expect any count of Quad Steps in future seasons.

In [41]:
plt.figure(figsize=(15,8))
sns.regplot(data=group8, x='Season', y='count')
plt.title("Quintuple Steps")
Out[41]:
Text(0.5, 1.0, 'Quintuple Steps')

For the Quintuple Steps, we can see a very large count in the middle seasons (5 through 7). Aside from a small count in Season 1, the obstacle was not seen outside those three seasons, including the most recent three. This gives close to zero correlation between count and season and does not indicate whether we will see this obstacle in future seasons.

In [42]:
plt.figure(figsize=(15,8))
sns.regplot(data=group9, x='Season', y='count')
plt.title("Rolling Log")
Out[42]:
Text(0.5, 1.0, 'Rolling Log')

For the Rolling Log, we can see a higher count more recently, which makes us believe it is a newer obstacle. It shows a positive correlation between count and season, which could indicate that we will see it in a future season.

In [43]:
plt.figure(figsize=(15,8))
sns.regplot(data=group10, x='Season', y='count')
plt.title("Rope Ladder")
Out[43]:
Text(0.5, 1.0, 'Rope Ladder')

For the Rope Ladder, there is absolutely no trend and no correlation between count and season, but every season has a count of exactly 1. This leads us to believe this obstacle will be seen once per season in the future.

In [44]:
plt.figure(figsize=(15,8))
sns.regplot(data=group11, x='Season', y='count')
plt.title("Salmon Ladder")
Out[44]:
Text(0.5, 1.0, 'Salmon Ladder')

For the Salmon Ladder, the count started low, increased, and then held steady at that level. Its counts in later seasons are consistently higher than in the early seasons, giving it a positive correlation. This leads us to believe there would be an average or higher count of this obstacle in future seasons.

In [45]:
plt.figure(figsize=(15,8))
sns.regplot(data=group12, x='Season', y='count')
plt.title("Wall Lift")
Out[45]:
Text(0.5, 1.0, 'Wall Lift')

For the Wall Lift, the highest count was clearly in Season 4, and it went lower after that. Every other time it was seen it had a count of only 1, and it hasn't been seen at all in the past three seasons. There is a negative correlation between count and season, leading us to expect not to see this obstacle in future seasons.

In [46]:
plt.figure(figsize=(15,8))
sns.regplot(data=group13, x='Season', y='count')
plt.title("Warped Wall")
Out[46]:
Text(0.5, 1.0, 'Warped Wall')

For the Warped Wall, there is a clear positive correlation between season and count. In addition, there has not been a single season without it. This tells us we can definitely expect to see the Warped Wall in future seasons, and at a very large count.

Now that we have looked at all of the obstacles separately, we have an idea of what to expect. A variety of obstacles had no trend, which makes it very difficult to predict what we might see in the future. Now it is time to create an interaction term between Obstacle Name and Season in order to predict the count. We can then analyze the summary and see whether there is any trend causing us to reject our null hypothesis.

In [47]:
model = ols(formula='count ~ ObstacleName*Season', data=frame).fit()
print(model.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  count   R-squared:                       0.478
Model:                            OLS   Adj. R-squared:                  0.352
Method:                 Least Squares   F-statistic:                     3.802
Date:                Sun, 15 May 2022   Prob (F-statistic):           8.29e-07
Time:                        15:58:12   Log-Likelihood:                -306.34
No. Observations:                 130   AIC:                             664.7
Df Residuals:                     104   BIC:                             739.2
Df Model:                          25                                         
Covariance Type:            nonrobust                                         
===========================================================================================================
                                              coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------------------------------
Intercept                                   2.1333      1.950      1.094      0.277      -1.734       6.001
ObstacleName[T.Floating Steps]             -5.6000      2.758     -2.030      0.045     -11.070      -0.130
ObstacleName[T.Invisible Ladder]           -2.4667      2.758     -0.894      0.373      -7.936       3.003
ObstacleName[T.Jump Hang]                   0.9333      2.758      0.338      0.736      -4.536       6.403
ObstacleName[T.Jumping Spider]              0.3333      2.758      0.121      0.904      -5.136       5.803
ObstacleName[T.Log Grip]                    1.2000      2.758      0.435      0.664      -4.270       6.670
ObstacleName[T.Quad Steps]                  1.4667      2.758      0.532      0.596      -4.003       6.936
ObstacleName[T.Quintuple Steps]             0.4000      2.758      0.145      0.885      -5.070       5.870
ObstacleName[T.Rolling Log]                -1.9333      2.758     -0.701      0.485      -7.403       3.536
ObstacleName[T.Rope Ladder]                -1.1333      2.758     -0.411      0.682      -6.603       4.336
ObstacleName[T.Salmon Ladder]              -1.4000      2.758     -0.508      0.613      -6.870       4.070
ObstacleName[T.Wall Lift]                   0.0667      2.758      0.024      0.981      -5.403       5.536
ObstacleName[T.Warped Wall]                 1.5333      2.758      0.556      0.579      -3.936       7.003
Season                                     -0.2061      0.314     -0.656      0.514      -0.829       0.417
ObstacleName[T.Floating Steps]:Season       1.3455      0.445      3.027      0.003       0.464       2.227
ObstacleName[T.Invisible Ladder]:Season     0.4667      0.445      1.050      0.296      -0.415       1.348
ObstacleName[T.Jump Hang]:Season           -0.0242      0.445     -0.055      0.957      -0.906       0.857
ObstacleName[T.Jumping Spider]:Season       0.0121      0.445      0.027      0.978      -0.869       0.894
ObstacleName[T.Log Grip]:Season            -0.0182      0.445     -0.041      0.967      -0.900       0.863
ObstacleName[T.Quad Steps]:Season          -0.1576      0.445     -0.354      0.724      -1.039       0.724
ObstacleName[T.Quintuple Steps]:Season      0.3273      0.445      0.736      0.463      -0.554       1.209
ObstacleName[T.Rolling Log]:Season          0.3697      0.445      0.832      0.408      -0.512       1.251
ObstacleName[T.Rope Ladder]:Season          0.2061      0.445      0.464      0.644      -0.675       1.088
ObstacleName[T.Salmon Ladder]:Season        0.8182      0.445      1.841      0.069      -0.063       1.700
ObstacleName[T.Wall Lift]:Season            0.0061      0.445      0.014      0.989      -0.875       0.888
ObstacleName[T.Warped Wall]:Season          1.1030      0.445      2.481      0.015       0.222       1.985
==============================================================================
Omnibus:                       58.305   Durbin-Watson:                   1.513
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              174.381
Skew:                           1.729   Prob(JB):                     1.36e-38
Kurtosis:                       7.498   Cond. No.                         191.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

The summary above gives us the elements of each individual regression line: the coefficients, intercepts, and p-values, which are what we are interested in. The base Season coefficient represents the Bridge of Blades (the reference level); adding each interaction coefficient to it gives the slope of the regression line for that obstacle. For example, the slope for the Warped Wall is -0.2061 + 1.1030 ≈ 0.897. In the code below we will grab these elements from the model and compute the slopes. A negative value tells us that, overall, the count of that obstacle is decreasing at that rate as seasons go on, and vice versa for a positive value. This all has to be read alongside the p-value shown in the chart: a p-value close to 0 means the slope is statistically significant, i.e. unlikely to differ from zero by chance alone, while a p-value far from 0 means the slope is not significant. Let's take a look at what we have in the regression summary.

In [48]:
coef = model.params.iloc[13:]  # the base Season slope plus each interaction term
names = ["Floating Steps", "Invisible Ladder", "Jump Hang", "Jumping Spider",
         "Log Grip", "Quad Steps", "Quintuple Steps", "Rolling Log",
         "Rope Ladder", "Salmon Ladder", "Wall Lift", "Warped Wall"]

print("Bridge of Blades -> ", coef.iloc[0])
for i, name in enumerate(names):
    print(name, "-> ", coef.iloc[0] + coef.iloc[i + 1])
Bridge of Blades ->  -0.20606060606060456
Floating Steps ->  1.139393939393939
Invisible Ladder ->  0.26060606060606056
Jump Hang ->  -0.2303030303030278
Jumping Spider ->  -0.19393939393939089
Log Grip ->  -0.2242424242424214
Quad Steps ->  -0.3636363636363633
Quintuple Steps ->  0.1212121212121216
Rolling Log ->  0.1636363636363623
Rope Ladder ->  1.0547118733938987e-15
Salmon Ladder ->  0.6121212121212118
Wall Lift ->  -0.199999999999998
Warped Wall ->  0.896969696969699

After pulling the coefficients and checking the p-values, we can see which obstacles have which kind of trend and what we might expect. Looking at the slopes, the Bridge of Blades, Jump Hang, Jumping Spider, Log Grip, Quad Steps, and Wall Lift have a negative correlation (if any), while the Floating Steps, Invisible Ladder, Quintuple Steps, Rolling Log, Rope Ladder, Salmon Ladder, and Warped Wall have a positive correlation (if any). To see whether these correlations mean anything, we have to look at the p-values. For almost all of the obstacles, everything except the Floating Steps, Salmon Ladder, and Warped Wall, the p-value is far from zero, which tells us those parameters are insignificant: there is not enough of a correlation to reject the null hypothesis for them. There is, however, enough evidence to reject the null hypothesis for the Warped Wall, Floating Steps, and Salmon Ladder. While this gives us a lot of insight into specific obstacles, only 3 of the 13 are significant, which is not enough to reject our overall null hypothesis that there is no correlation between season and the count of relevant obstacles.
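The significance check above can also be done programmatically rather than by eye. With a fitted statsmodels result this would read the `model.pvalues` attribute directly; as a sketch, we hand-enter a few of the interaction p-values from the OLS summary above into a Series and filter them:

```python
import pandas as pd

# interaction-term p-values copied from the OLS summary table above
pvals = pd.Series({
    'ObstacleName[T.Floating Steps]:Season': 0.003,
    'ObstacleName[T.Salmon Ladder]:Season':  0.069,
    'ObstacleName[T.Warped Wall]:Season':    0.015,
    'ObstacleName[T.Jump Hang]:Season':      0.957,
    'ObstacleName[T.Rolling Log]:Season':    0.408,
})

# at a 0.10 threshold, only three obstacles show a significant season trend
significant = pvals[pvals < 0.10].index.tolist()
```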

In conclusion, if we were to go on American Ninja Warrior, we would definitely want to practice the Warped Wall, Floating Steps, and Salmon Ladder. Beyond those obstacles, though, we would be at a loss as to what to practice.